Southern African Origin of HTLV-1 in Romania

In Europe, most HTLV-1-infected individuals originate from highly endemic regions such as the West Indies, sub-Saharan Africa, and South America. The only genuine endemic region for HTLV-1 in Europe is Romania, where series of ATL cases have been reported among Romanian patients. Our objective is to better understand the origin of this endemic focus based on a study of the genetic diversity of HTLV-1 in Romanians. DNA was obtained from PBMCs/buffy coats of 11 unrelated HTLV-1-infected individuals of Romanian origin. They include 9 ATL cases and 2 asymptomatic carriers. LTR sequences were obtained for all specimens. Complete genomic HTLV-1 sequences were obtained using four PCR series on 10 specimens. Phylogenetic trees were generated from multiple alignments using HTLV-1 prototypic sequences and the newly generated sequences. Most of the complete LTR sequences (756-bp) showed low nucleotide diversity, ranging from 0% to 0.8% difference, and were closely related (less than 0.8% divergence) to the only previously characterized Romanian strain, RKI2. One strain, ROU7, diverged slightly (1.5% on average) from the others. Phylogenetic analyses of both the partial LTR and the complete genome demonstrate that the 11 sequences belong to the HTLV-1a cosmopolitan genotype and that 10 of them belong to the previously denominated a-TC Mozambique–Southern Africa A subgroup. In this study, we demonstrated that the HTLV-1 present in Romania most probably originated in Southern Africa. As most Romanian HTLV-1 strains are very closely related, we can assume that HTLV-1 has been introduced into the Romanian population recently. Further studies are ongoing to decipher the routes of arrival and dissemination of these HTLV-1 strains, and to date the emergence of this endemic focus in Central Europe.


Introduction
Human T-cell lymphotropic virus type 1 (HTLV-1) is a human oncoretrovirus infecting at least 5 to 10 million individuals worldwide. It has major public health implications, mainly due to its causal association with severe diseases such as adult T-cell leukemia/lymphoma (ATL) and HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) [1]. Based on genetic variability, seven main HTLV-1 molecular genotypes (a to g) have been described in certain geographic areas and ethnic groups [2,3]. Although the virus is highly endemic in Southern Japan, sub-Saharan Africa, the Caribbean and Australo-Melanesia, its prevalence varies significantly from one population to another [2]. Moreover, HTLV-1 demonstrates remarkable genetic stability, with the observed polymorphism among viral strains correlating with the geographic origin of infected individuals. This limited genetic drift offers a molecular avenue for tracking viral transmission and the historical migrations of infected populations. In European countries such as France, the United Kingdom or Spain, HTLV-1 is rare, except in people who have immigrated from highly endemic regions such as sub-Saharan Africa, the West Indies and South America, respectively [1,2].
Over the past 30 years, several studies have indicated that Romania, a Central European country, is an HTLV-1 endemic area in Europe [4][5][6][7][8][9][10]. This is based, on the one hand, on studies of blood donors (mandatory screening since 1999) with a seroprevalence of up to 0.053 in first-time blood donors (ten times higher than in France and the UK) [11] and, on the other hand, on reports of ATL cases, mainly sporadic ones or small series [5][6][7][8], but also recently a large series of 56 patients [9]. However, no specific studies have been carried out to determine the origin of this virus. Therefore, to gain new insights into this origin, we decided to study the genetic diversity of the HTLV-1 present in Romanians.

Ethics statement
Formal written consent was obtained from participants in this study, which was approved by the Human Protection Committee Ile de France II (Comité de Protection des Personnes, CPP IdF II), registered as an Institutional Review Board (IRB registration number: 000001072), and by the French Data Protection Authority (Commission Nationale de l'Informatique et des Libertés, CNIL registration number: 1692254).

Studied individuals and data collection
We included eleven unrelated HTLV-1-infected individuals living in Romania, most of them suffering from ATL (Table 1). They were referred for consultation and sometimes therapeutic treatment, mainly to the hematology department of Necker-Enfants Malades hospital (APHP) in Paris. They were all born in Romania, most of them lived in Bucharest, and they self-identified as Caucasian.

HIV and HBV screening and blood transfusion history
All but 4 patients (ROU5, ROU9, PH523 and PH630) were tested for HIV. HBV infection status was known for only 5 individuals (ROU1, ROU2, ROU3, ROU7 and ROU9). Two individuals (ROU1 and ROU8) reported transfusion events that took place in Romania during infancy.
The 4 fragments obtained were then sequenced using 18 pairs of primers [12], and the ClustalW algorithm (MacVector 18.6.1 software, Oxford Molecular) was used to align the forward and reverse sequences of each segment and derive a consensus sequence of the full LTR (756-bp) region and of the colinearized gag-pol-env-tax (7,567-bp) genes.

Phylogenetic analyses
For the LTR and concatenated gag-pol-env-tax genes analyses, sequences were aligned using the DAMBE program (version 4.2.13). The final alignment was submitted to the Modeltest program (version 3.6) and the best model was selected according to the Akaike information criterion. This model was then applied to phylogenetic analyses using the PAUP program (version 4.0b10) to infer trees according to the Neighbor-Joining (NJ) method. The tree architecture was later validated by the Maximum Likelihood method, using SeaView (version 5); the robustness of the clades was estimated by the approximate likelihood-ratio test (aLRT).
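As an illustration of the model-selection step, the Akaike information criterion is AIC = 2k - 2 ln L, where k is the number of free parameters and ln L the maximized log-likelihood; the model with the lowest AIC is retained. The sketch below is not the Modeltest implementation: the log-likelihood values are hypothetical placeholders, and the parameter counts cover only the substitution model (branch lengths are left out, which is an assumption of this example).

```python
# Minimal sketch of AIC-based model selection.
# All log-likelihoods below are HYPOTHETICAL illustrative values,
# not results from this study.

def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

# model name -> (maximized log-likelihood, free substitution-model parameters)
candidates = {
    "JC69": (-2150.0, 0),
    "HKY85": (-2105.0, 4),
    "GTR": (-2098.0, 8),
    "GTR+G": (-2080.0, 9),
}

scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}

# The preferred model is the one with the smallest AIC.
best_model = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:6s} AIC = {score:.1f}")
print("Selected model:", best_model)
```

With these placeholder values, the extra parameters of the richer model are outweighed by its gain in likelihood, which is the trade-off the criterion is designed to capture.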

LTR analyses
Phylogenetic comparison was performed on a 485-nucleotide-long LTR alignment of isolates, including the 11 sequences generated in this study (in bold red) and 70 reference strains. The analysis was restricted to 485 bp because most of the LTR sequences available are incomplete; this length, although shorter than the complete LTR (756-bp), made it possible to include a significant number of sequences from the different subgroups known in the transcontinental genotype. Two Transcontinental Japanese LTR sequences were used as outgroup (a-JPN). The phylogenetic tree was derived by the neighbor-joining method using the GTR model (gamma = 0.2884). Horizontal branch lengths are drawn to scale, with the bar indicating 0.005 nucleotide replacement per site. Numbers on each node indicate the support probability (approximate likelihood-ratio test, aLRT, calculated using the SeaView 5 software) for each cluster (Fig 1). Next to each sequence, three letters symbolize the country of origin of the infected individual (mostly IOC country codes): MOZ-Mozambique, RSA-Republic of South Africa, SWZ-Eswatini, IRN-Iran, KWT-Kuwait, SEN-Senegal, DRC-Democratic Republic of the Congo, UGA-Uganda, CIV-Côte d'Ivoire, ROU-Romania, TGO-Togo, AGO-Angola, NIG-Nigeria, JPN-Japan. The topology of the phylogenetic tree was validated by the Maximum Likelihood method. The groups of interest are colored as follows: grey, blue, purple, orange, green and yellow correspond to the Transcontinental (TC) HTLV-1a genotype, Mozambique-Southern Africa (B), Mozambique-Middle East, Mozambique, Mozambique-Southern Africa (A) and Japanese (a-JPN) groups, respectively. GenBank accession nos. OR852771 to OR852781 correspond to the 11 new LTR sequences obtained from the Romanian individuals studied.

Gag-pol-env-tax genes analyses
The phylogenetic analysis was derived from the Neighbor-Joining method using the GTR model (gamma = 0.4815) and was confirmed using the Maximum Likelihood method. The phylogenetic tree was obtained from concatenated fragments of the 6,366-nucleotide-long gag-pol-env-tax genes, generated from 46 available complete HTLV-1 genomes comprising the sequences of genotypes a, b and c and the ten sequences generated in this work (in red bold type). Five complete Australo-Melanesian HTLV-1c sequences were used as outgroup. The topology of the tree was identical when the different genes were analyzed separately. Horizontal branch lengths are drawn to scale, with the bar indicating 0.01 nucleotide replacement per site. Numbers on each node indicate the aLRT for the supported cluster (Fig 2). GenBank accession nos. OR825456 to OR825465 correspond to the 10 new full HTLV-1 sequences obtained from the Romanian individuals studied.

Results
The 11 HTLV-1-infected individuals from Romania included 9 women (mean age 39.3 years, range 17-77) and 2 men (mean age 37 years) (Table 1). Of the eleven patients, four had acute ATL, three had chronic ATL and two had ATL lymphoma. In addition, two were asymptomatic carriers. The proviral load, expressed as a percentage of infected peripheral blood mononuclear cells (PBMCs), ranged from less than 1% to over 50%. We were able to test for the presence of HIV antibodies in 6 people, all of whom were negative (ROU1, ROU2, ROU3, ROU4, ROU7 and ROU8) (Table 1). Regarding hepatitis B virus status, only 5 individuals were tested: one was vaccinated (ROU1), one had antibodies indicating past infection (ROU2), two were negative (ROU3 and ROU7) and the last one was an HBs antigen carrier, indicating chronic infection (ROU9). Two patients had a history of blood transfusion performed in Romania at an early age: ROU1 was transfused during the neonatal period, and ROU8 had multiple transfusions during childhood (at 5 years old) for acute leukemia.

Long Terminal Repeat (LTR) region analysis
The complete LTR sequence was obtained for the 11 samples from PCR fragments F1 and F4. Three sequences were identical (ROU2, ROU5 and ROU8). Most sequences displayed a low nucleotide diversity, i.e. less than 0.8% difference, and were closely related (less than 0.8% divergence) to the only previously characterized Romanian strain, RKI2 [13]. One strain, ROU7, diverged slightly (1.5% on average) from the others.
Phylogenetic analysis was performed on a 485-bp-long LTR alignment of isolates, including the 11 sequences generated in this study (in bold red) and 70 reference strains that represent the different subtypes within the HTLV-1 transcontinental (a-TC) strains (Fig 1). These subgroups were defined and named according to previous studies based on analysis of the LTR region of strains from very different geographical origins [3,14]. The analysis shows that the 11 new Romanian strains belong to the transcontinental HTLV-1a subgroup (a-TC). In addition, all but one (ROU7) belong to the a-TC Mozambique-Southern Africa A subgroup, in which the Romanian RKI2 strain is also present (Fig 1). ROU7 is positioned outside this group.
The 11 newly characterized sequences from Romania do not belong to the TC-Japan or TC-Middle East subgroups.
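Percentages of nucleotide difference such as those quoted above are typically uncorrected pairwise distances (p-distances) computed on aligned sequences. The following is a minimal sketch under that assumption; it is not the software used in this study, the function name is hypothetical, and the toy sequences are invented for illustration only.

```python
from itertools import combinations

def percent_difference(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance (in %) between two ALIGNED sequences.

    Columns where either sequence has a gap ('-') or an ambiguous
    base ('N') are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differing / compared

# Toy alignment (invented sequences, NOT data from this study).
toy_alignment = {
    "strainA": "ACGTACGTAC",
    "strainB": "ACGTACGTAC",
    "strainC": "ACGTTCGTAC",
}

pairwise = {
    (x, y): percent_difference(toy_alignment[x], toy_alignment[y])
    for x, y in combinations(toy_alignment, 2)
}

for (x, y), value in pairwise.items():
    print(f"{x} vs {y}: {value:.1f}% difference")
```

Identical sequences give 0%, and a single mismatch over ten compared positions gives 10%. On a 756-bp LTR, a 0.8% difference corresponds to roughly six mismatches.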

Colinearized gag-pol-env-tax genes fragment analysis
Due to limited DNA availability for sample ROU8, we were able to generate the complete HTLV-1 proviral sequence for only 10 of the 11 samples. Alignment of a 7523-nt-long segment of the 10 genomes (the complete genome without LTRs) shows that, as with the LTR, 9 strains were very close to each other: 2 are identical (ROU2 and ROU5) and the maximal nucleotide divergence is 0.2%. One strain (ROU7) was more distant, with a divergence of around 1%.
Analysis of the sequences showed that the open reading frames (ORFs) were conserved, with the exception of strain PH630, which presents a stop codon in the pol gene.
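A check of this kind can be sketched as a scan for in-frame stop codons that occur before the last codon of a coding sequence. This is an illustrative sketch only, not the procedure used in this study: the function name is hypothetical, the toy sequences are invented, and the coding sequence is assumed to be supplied already in the correct reading frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def premature_stops(cds: str) -> list[int]:
    """Return 0-based start positions of in-frame stop codons that
    occur before the final complete codon of the coding sequence.

    Assumption: `cds` starts at the first base of the reading frame.
    """
    cds = cds.upper()
    # Start index of the last complete codon (the expected terminal stop).
    last_codon_start = len(cds) - len(cds) % 3 - 3
    positions = []
    for i in range(0, len(cds) - 2, 3):
        if i < last_codon_start and cds[i:i + 3] in STOP_CODONS:
            positions.append(i)
    return positions

# Toy examples (invented sequences, NOT data from this study).
intact_orf = "ATGGCCTTTTAA"      # ATG GCC TTT TAA -> only the terminal stop
interrupted_orf = "ATGTAGTTTTAA"  # ATG TAG TTT TAA -> stop at position 3

print(premature_stops(intact_orf))
print(premature_stops(interrupted_orf))
```

An empty list means the open reading frame is conserved up to its terminal codon; any returned position marks a codon that would truncate the predicted protein.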
The phylogenetic tree was obtained from concatenated fragments of the 6366-nucleotide-long Gag-Pol-Env-Tax genes, generated from 46 available complete HTLV-1 genomes comprising the sequences of genotypes a, b and c and the ten new sequences generated in this work (in red bold type) (Fig 2). This phylogenetic analysis confirms that the 10 new Romanian HTLV-1 strains all belong to the HTLV-1 a-TC subgroup. Notably, 9 of the 10 Romanian strains (sample ROU7 being the exception) form a specific "Romania" clade within the highly supported a-TC Mozambique-Southern Africa A subgroup (Fig 2). The topology of the tree was identical when the different genes were analyzed separately.

Discussion
In this molecular epidemiological study, we demonstrated that the HTLV-1 present in Romanians living in Romania most probably originated in Southern Africa. As most Romanian HTLV-1 strains are very closely related, and sometimes identical, we can assume that HTLV-1 was introduced into the Romanian population fairly recently through a limited number of strains, leading to a viral founder effect. Furthermore, analysis of the different HTLV-1 genes separately generated a similar topology of the phylogenetic tree, consistent with the fact that mutations are found scattered throughout the genome. This suggests that these viruses were not generated by recombination [15].
Concerning the presence of a stop codon in the pol gene of the proviral strain of sample PH630, it has been suggested that ATL cells are clonally selected notably by acquiring a number of genetic alterations in viral genes (gag, pol, tax) that favor escape from the host immune system [16].
ATL develops preferentially in people infected with HTLV-1 in childhood, and rarely occurs in those infected in adulthood [17]. It is therefore likely that the ATL patients included in this study were infected at an early age. Interestingly, two patients had HTLV-1-seropositive mothers (PH630 and ROU7), suggesting possible mother-to-child transmission. The transfusion of two patients (ROU1 and ROU8) at an early age in Romania may explain their HTLV-1 positivity.
On the basis of the data available at present, we cannot demonstrate, or even suggest, how this virus entered Romania and then spread there. It is tempting to think that there is some similarity between the spread of HTLV-1 and the situation that existed a few decades ago in Romania for the epidemics of HIV and hepatitis B, particularly in children [18,19]. However, only solid, in-depth epidemiological studies specifically looking for risk factors for HTLV-1 acquisition in the Romanian population and in the families of infected individuals, combined with historical data, will make it possible to better understand the origin of the current situation. The fact that cases of ATL appeared clinically several decades ago in patients who were then on average 40 years old (this report and [5,9]) suggests that HTLV-1 infections were occurring in Romania nearly 60 years ago. It should be noted that this epidemiological work will certainly be difficult to carry out because it is retrospective: it concerns events that are already old and that are associated with public health decisions taken at the time. The recent discovery that Moldova, a small country bordering Romania to the east, also appears to be an area of high HTLV-1 endemicity, based on blood donor studies, should also encourage local public health authorities to carry out this type of epidemiological study [20].
Above all, it is essential to pursue surveillance and research efforts to limit the spread of this virus within the Romanian population.

Fig 1. Phylogenetic tree generated with the neighbor-joining (NJ) method on a 485-bp fragment of the LTR region for 81 available HTLV-1 sequences, including the 11 generated in this work (in bold red type). The numbers at some nodes of the tree correspond to the probability of each monophyletic group (aLRT). The branch lengths are drawn to scale, with the bar indicating 0.005 nucleotide replacement per site. The ATK-1 and ATL-25 strains were used as outgroup. https://doi.org/10.1371/journal.pntd.0012337.g001

Fig 2. Phylogenetic tree generated with the neighbor-joining (NJ) method on a 6,366-bp fragment of the HTLV-1 Gag-Pol-Env-Tax concatenated genes for 56 available sequences, including the 10 generated in this work (in bold red type). The numbers at some nodes of the tree correspond to the probability of each monophyletic group (aLRT). The branch lengths are drawn to scale, with the bar indicating 0.01 nucleotide replacement per site. Five HTLV-1c Australo-Melanesian strains were used as outgroup. https://doi.org/10.1371/journal.pntd.0012337.g002